Content On This Page: Introduction · Random Experiment and Sample Space · Event · Conditional Probability · Total Probability · Bayes’ Theorem


Chapter 9 Probability (Concepts)

Welcome to this foundational chapter on Probability Theory, an indispensable tool within Applied Mathematics for analyzing uncertainty and making informed decisions in the face of randomness. Probability allows us to quantify likelihood, moving from vague notions of chance to precise mathematical models. Its principles underpin numerous fields, including risk analysis in finance and insurance, statistical inference in data science, quality control in manufacturing, modeling in physics and biology, and strategic decision-making in business and operations research. This chapter aims to equip you with the fundamental language, axioms, and techniques needed to calculate and interpret probabilities effectively, focusing on the rigorous axiomatic approach and its practical applications.

We begin by establishing the core terminology used throughout probability theory. A random experiment is any process with an uncertain outcome (e.g., tossing a coin, rolling a die, drawing a card). Each possible result of an experiment is an outcome. The set of all possible outcomes constitutes the Sample Space, denoted by $S$. Any subset of the sample space is called an Event ($E$), representing a specific result or collection of results of interest. Understanding these basic definitions is crucial for framing probability problems correctly. Later in the chapter, events are further classified as simple, compound, impossible, or sure events.

While intuitive notions of probability exist, the modern theory rests upon the Axiomatic Definition proposed by Kolmogorov. Probability ($P$) is defined as a function that assigns a real number $P(E)$ to each event $E$ in the sample space $S$, satisfying three fundamental axioms:

  1. Non-negativity: For any event $E$, $P(E) \ge 0$.
  2. Normalization: The probability of the sure event is 1, i.e., $P(S) = 1$.
  3. Additivity: If $E$ and $F$ are mutually exclusive events, then $P(E \cup F) = P(E) + P(F)$.

From these simple axioms, several important consequences can be derived, including $P(\emptyset) = 0$, $P(E') = 1 - P(E)$, and $0 \le P(E) \le 1$. The axiomatic approach provides a consistent and rigorous foundation. For experiments where all outcomes are considered equally likely (like fair coin tosses or dice rolls), the probability of an event $E$ simplifies to the classical definition: $P(E) = \frac{\text{Number of outcomes favorable to } E}{\text{Total number of possible outcomes}} = \frac{n(E)}{n(S)}$. Calculating $n(E)$ and $n(S)$ often requires employing counting techniques from Permutations and Combinations (using formulas like $\binom{n}{r}$ or $P(n,r)$).
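As a concrete illustration (an assumed example, consistent with the formulas above), the following Python sketch combines the classical definition with combination counting: the probability that both cards are aces when two cards are dealt from a standard 52-card deck.

```python
from math import comb
from fractions import Fraction

# Total outcomes: ways to choose any 2 cards from 52
n_S = comb(52, 2)      # 1326
# Favourable outcomes: ways to choose 2 of the 4 aces
n_E = comb(4, 2)       # 6

# Classical definition: P(E) = n(E) / n(S)
p_both_aces = Fraction(n_E, n_S)
print(p_both_aces)     # 1/221
```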

Building upon the axioms, we explore key probability calculations. The Addition Rule for any two events (not necessarily mutually exclusive) is given by $P(E \cup F) = P(E) + P(F) - P(E \cap F)$. This formula accounts for outcomes counted twice if the events overlap. We introduce Conditional Probability, $P(A|B)$, representing the probability of event $A$ occurring given that event $B$ has already occurred. It is defined as $P(A|B) = \frac{P(A \cap B)}{P(B)}$, provided $P(B) \neq 0$. This leads to the Multiplication Rule of Probability: $P(A \cap B) = P(A)P(B|A) = P(B)P(A|B)$, essential for calculating probabilities of sequential events. Finally, we define Independent Events as events where the occurrence of one does not influence the probability of the other. Mathematically, $A$ and $B$ are independent if $P(A \cap B) = P(A)P(B)$. Understanding these concepts and formulas allows us to tackle a wide range of probability problems involving dice, coins, cards, selections, and other random phenomena critical to applied analysis.
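As a small numerical check of the Addition Rule (a sketch with illustrative events, not taken from the text), the snippet below enumerates one roll of a fair die with $E$ = "even number" and $F$ = "number greater than 3" and confirms that $P(E \cup F) = P(E) + P(F) - P(E \cap F)$.

```python
from fractions import Fraction

S = set(range(1, 7))   # sample space for one roll of a fair die
E = {2, 4, 6}          # even number
F = {4, 5, 6}          # number greater than 3

def P(event):
    """Classical probability: favourable outcomes / total outcomes."""
    return Fraction(len(event), len(S))

lhs = P(E | F)                      # direct enumeration of E ∪ F
rhs = P(E) + P(F) - P(E & F)        # addition rule
print(lhs, rhs, lhs == rhs)         # 2/3 2/3 True
```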



Introduction

Probability is a fundamental branch of mathematics that deals with the study of chance and uncertainty. In our daily lives, we constantly encounter situations where the outcome is not predetermined or certain. From predicting the weather to deciding whether to carry an umbrella, or assessing the likelihood of getting a specific card in a game, we are informally dealing with concepts of probability.


Understanding Uncertainty

Life is full of uncertainty. We often express this uncertainty using qualitative terms such as "likely," "unlikely," "possible," "impossible," "certain," "probably," or "maybe." For example, we might say "it will probably rain today," "our team is likely to win the match," or "it is impossible to roll a 7 on a standard die."

While these terms give us a general idea, they lack precision. Probability provides a quantitative way to measure this uncertainty, assigning a numerical value to the likelihood of an event occurring. This numerical value allows for more rigorous analysis, comparison, and decision-making in uncertain environments.


The Genesis of Probability

The formal study of probability originated in the 16th and 17th centuries, primarily motivated by problems related to games of chance. Italian mathematician Gerolamo Cardano (1501-1576) wrote one of the earliest works on probability, focusing on outcomes in dice games. Later, in the mid-17th century, a famous correspondence between French mathematicians Pierre de Fermat (1601-1665) and Blaise Pascal (1623-1662), prompted by a question about dividing stakes in an interrupted game of chance, laid the foundation for the mathematical theory of probability. Jacob Bernoulli (1654-1705), Pierre-Simon Laplace (1749-1827), and others further developed the theory.

Initially confined to gambling, the applications of probability theory rapidly expanded to various fields, including insurance and actuarial science, finance and risk analysis, statistics and data analysis, quality control in manufacturing, the physical and biological sciences, and business decision-making.

Today, probability is an indispensable tool in almost every area where uncertainty plays a role.


Quantifying Probability: The Probability Scale

The probability of an event is a numerical value that ranges from 0 to 1, inclusive. A probability of 0 means the event is impossible, a probability of 1 means the event is certain (sure), and values in between indicate increasing likelihood; for instance, a probability of 0.5 means the event is as likely to occur as not.

Probabilities can be expressed as fractions (e.g., $\frac{1}{2}$), decimals (e.g., $0.5$), or percentages (e.g., $50\%$).


Approaches to Probability

There are different perspectives or approaches to understanding and calculating probability:

1. Classical Approach (or A Priori Probability):

This approach is applicable when all possible outcomes of an experiment are equally likely and mutually exclusive. The probability of an event E is defined as the ratio of the number of outcomes favourable to E to the total number of possible outcomes.

If $n(E)$ is the number of outcomes favourable to event E, and $n(S)$ is the total number of possible outcomes in the sample space S, then the probability of event E is:

$\text{P}(E) = \frac{\text{Number of favourable outcomes}}{\text{Total number of possible outcomes}} = \frac{n(E)}{n(S)}$

... (i)

Example: The probability of getting a head when flipping a fair coin is $\frac{1}{2}$ because there is 1 favourable outcome (Head) out of 2 equally likely possible outcomes (Head, Tail). The probability of rolling an even number on a standard six-sided die is $\frac{3}{6} = \frac{1}{2}$ because there are 3 favourable outcomes (2, 4, 6) out of 6 equally likely possible outcomes (1, 2, 3, 4, 5, 6).
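The two calculations above can be reproduced by listing the sample spaces explicitly; a minimal, purely illustrative Python sketch:

```python
from fractions import Fraction

# Coin toss: 1 favourable outcome (Head) out of 2 equally likely outcomes
coin_space = ["H", "T"]
p_head = Fraction(coin_space.count("H"), len(coin_space))
print(p_head)        # 1/2

# Die roll: favourable outcomes are the even faces 2, 4, 6
die_space = list(range(1, 7))
evens = [face for face in die_space if face % 2 == 0]
p_even = Fraction(len(evens), len(die_space))
print(p_even)        # 1/2
```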

2. Empirical Approach (or A Posteriori Probability / Relative Frequency Approach):

This approach is based on the results of actual experiments or observations. The probability of an event is defined as the relative frequency of its occurrence in a large number of trials.

If an experiment is conducted $N$ times, and an event E occurs $n(E)$ times, then the empirical probability of event E is:

$\text{P}(E) = \frac{\text{Number of times event E occurred}}{\text{Total number of trials}} = \frac{n(E)}{N}$

... (ii)

As the number of trials ($N$) increases, the empirical probability tends to approach the classical (or true) probability. This is related to the Law of Large Numbers.

Example: If a company produces 1000 bulbs and finds 5 defective bulbs, the empirical probability of a bulb being defective is $\frac{5}{1000} = 0.005$. This is widely used in quality control, insurance (actuarial science), and scientific experiments.
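The connection to the Law of Large Numbers can be seen in a quick simulation; the sketch below (assuming a fair coin and using Python's random module) tracks the relative frequency of heads as the number of trials grows.

```python
import random

random.seed(42)  # fixed seed so the illustration is reproducible

for n_trials in (10, 100, 1_000, 100_000):
    heads = sum(random.random() < 0.5 for _ in range(n_trials))
    print(f"N = {n_trials:>6}: relative frequency of heads = {heads / n_trials:.4f}")
```

Typically, the relative frequency fluctuates noticeably for small $N$ and stabilises near the classical value $0.5$ for large $N$.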

3. Subjective Approach:

This approach defines probability based on an individual's personal judgment, beliefs, or experience, especially when there is no objective way to determine equally likely outcomes or conduct repeated trials. Subjective probabilities are often used in fields where unique events or incomplete information exist.

Example: A financial analyst might assign a $70\%$ probability to a particular stock increasing in value based on their analysis of market conditions and company performance. A doctor might estimate a patient's probability of recovery based on their medical knowledge and the patient's condition.

In this chapter, we will primarily delve into the Classical and Axiomatic approaches to probability, building a solid foundation for understanding more complex concepts.



Random Experiment and Sample Space

In the study of probability, the first step is to clearly define the experiment we are performing and list all its possible outcomes. This brings us to the crucial concepts of a random experiment and its associated sample space.


Random Experiment

A Random Experiment is a process or action whose outcome cannot be predicted with certainty beforehand, but the set of all possible outcomes is known. It is characterized by the element of chance.

Key Characteristics of a Random Experiment:

  1. It has more than one possible outcome.
  2. The set of all possible outcomes is known in advance.
  3. The exact outcome of any particular trial cannot be predicted beforehand.
  4. It can, at least in principle, be repeated under identical conditions.

Examples of Random Experiments: tossing a coin, rolling a die, drawing a card from a well-shuffled deck, or drawing a ball at random from a bag.


Sample Space

The set of all possible outcomes of a random experiment is called the Sample Space. It is usually denoted by the capital letter $S$ or $\Omega$. Each element in the sample space is called a sample point or an outcome.

The sample space provides a complete list of everything that can happen when a random experiment is performed.

Examples of Sample Spaces:

  1. Tossing a coin once: $S = \{H, T\}$.
  2. Rolling a single die: $S = \{1, 2, 3, 4, 5, 6\}$.
  3. Tossing a coin twice: $S = \{(H, H), (H, T), (T, H), (T, T)\}$.


Types of Sample Spaces: Finite and Infinite

Sample spaces can be classified based on the number of outcomes they contain:

1. Finite Sample Space: A sample space that contains a finite number of outcomes. For example, rolling a die gives $S = \{1, 2, 3, 4, 5, 6\}$ with $n(S) = 6$.

2. Infinite Sample Space: A sample space that contains infinitely many outcomes. For example, tossing a coin repeatedly until a head appears gives $S = \{1, 2, 3, \dots\}$ (the number of tosses required), which has no upper limit.


Example 1. A bag contains 1 red ball and 3 blue balls. A ball is drawn at random from the bag.

Identify the random experiment and write its sample space.

Answer:

The random experiment is drawing a ball from the bag. The outcome is uncertain because we do not know which specific ball (red or blue) will be drawn before the experiment.

The possible outcomes are drawing a red ball or drawing a blue ball.

Let R denote the outcome of drawing a red ball and B denote the outcome of drawing a blue ball.

The sample space is the set of all possible outcomes.

$S = \{\text{Red, Blue}\}$ or $S = \{R, B\}$.

In this example, even though the bag contains 4 balls, the outcomes are listed by colour, giving $S = \{\text{Red, Blue}\}$. If instead the balls are treated as distinguishable (say $R_1, B_1, B_2, B_3$), the sample space is $S = \{R_1, B_1, B_2, B_3\}$ with $n(S) = 4$. For introductory purposes, describing outcomes by type is common unless stated otherwise. Note, however, that the two colour outcomes are not equally likely (there are three times as many blue balls as red), so when probabilities are calculated later, the count of equally likely outcomes must be based on the 4 distinguishable balls.
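A short sketch (with assumed labels $R_1, B_1, B_2, B_3$ for the four distinguishable balls) shows how the colour-level probabilities follow from the distinguishable-outcome sample space:

```python
from fractions import Fraction

# Distinguishable, equally likely outcomes: 1 red ball and 3 blue balls
sample_space = ["R1", "B1", "B2", "B3"]

p_red = Fraction(sum(ball.startswith("R") for ball in sample_space), len(sample_space))
p_blue = Fraction(sum(ball.startswith("B") for ball in sample_space), len(sample_space))
print(p_red, p_blue)   # 1/4 3/4
```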



Event

In the context of probability, an Event is a specific outcome or a collection of outcomes from a random experiment. Mathematically, an event is defined as a subset of the sample space.


Definition of an Event

Let $S$ be the sample space associated with a random experiment. An Event, denoted by $E$, is any subset of $S$. When the outcome of the random experiment is an element belonging to the set $E$, we say that the event $E$ has occurred.

Illustrative Examples: In rolling a die, $S = \{1, 2, 3, 4, 5, 6\}$; the event "getting an even number" is the subset $E = \{2, 4, 6\}$, and it occurs whenever the die shows 2, 4, or 6. In tossing two coins, the event "getting at least one Head" is the subset $\{(H, H), (H, T), (T, H)\}$ of the sample space.


Types of Events

Events can be classified based on the number of outcomes they contain or their relationship with other events:

1. Simple Event (or Elementary Event):

An event is called a simple event if it consists of a single outcome from the sample space.

Example: In rolling a die, $\{3\}$ is a simple event (getting a 3). In tossing two coins, $\{(H, T)\}$ is a simple event (getting Head on the first coin and Tail on the second).

2. Compound Event:

An event is called a compound event if it consists of more than one outcome from the sample space. Compound events are formed by combining two or more simple events.

Example: In rolling a die, $\{2, 4, 6\}$ (getting an even number) is a compound event. In tossing two coins, $\{(H, H), (H, T), (T, H)\}$ (getting at least one Head) is a compound event.

3. Impossible Event:

An event that contains no outcomes from the sample space is called an impossible event. It is the empty set, denoted by $\emptyset$.

Example: In rolling a die, getting a number greater than 6 is an impossible event. $E = \{x : x \in S, x > 6\} = \emptyset$.

4. Sure Event (or Certain Event):

An event that contains all possible outcomes of the sample space is called a sure event. It is equal to the sample space $S$ itself.

Example: In rolling a die, getting a number less than 7 is a sure event. $E = \{x : x \in S, x < 7\} = \{1, 2, 3, 4, 5, 6\} = S$.
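Because events are just subsets of the sample space, these classifications can be checked mechanically; a small Python sketch for the die-rolling examples above:

```python
S = set(range(1, 7))                      # sample space for one roll of a die

simple_event   = {3}                      # a single outcome
compound_event = {2, 4, 6}                # more than one outcome
impossible     = {x for x in S if x > 6}  # number greater than 6: empty set
sure_event     = {x for x in S if x < 7}  # number less than 7: all of S

print(impossible == set())                              # True  (impossible event)
print(sure_event == S)                                  # True  (sure event)
print(len(simple_event) == 1, len(compound_event) > 1)  # True True
```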


Classical Definition of Probability

The classical definition of probability is one of the oldest and simplest ways to define the probability of an event. It is applicable when the sample space is finite and all the outcomes are equally likely.

If a random experiment has a finite sample space $S$ with $n(S)$ equally likely outcomes, and $E$ is an event (a subset of $S$) containing $n(E)$ outcomes, then the probability of event $E$ occurring, denoted by $P(E)$, is defined as:

$\text{P}(E) = \frac{\text{Number of outcomes favourable to E}}{\text{Total number of possible outcomes in S}}$

$\text{P}(E) = \frac{n(E)}{n(S)}$

... (i)

Here, "outcomes favourable to E" means the outcomes from the sample space $S$ that satisfy the condition defined by event $E$. These are precisely the elements of the set $E$.

Assumption: This definition heavily relies on the assumption that all outcomes in the sample space are equally likely. This is true for fair coins, fair dice, well-shuffled decks of cards, etc., but not for biased objects or experiments where outcomes have different chances of occurring.


Basic Properties of Probability (based on Classical Definition):

Based on the classical definition, the probability $P(E)$ of any event $E$ from a finite sample space $S$ satisfies the following properties:

  1. $0 \le P(E) \le 1$ for every event $E$.
  2. $P(\emptyset) = 0$, i.e., the probability of the impossible event is 0.
  3. $P(S) = 1$, i.e., the probability of the sure event is 1.
  4. $P(E') = 1 - P(E)$, where $E'$ is the complementary event "not $E$".


Example 1. A fair die is rolled. Find the probability of getting an even number.

Answer:

The random experiment is rolling a fair die.

The sample space $S$ consists of all possible outcomes: $S = \{1, 2, 3, 4, 5, 6\}$.

The total number of possible outcomes is $n(S) = 6$. Since the die is fair, all outcomes are equally likely.

Let $E$ be the event of getting an even number. The outcomes favourable to event $E$ are $\{2, 4, 6\}$.

The number of outcomes favourable to $E$ is $n(E) = 3$.

Using the classical definition of probability (Formula (i)):

$\text{P}(E) = \frac{n(E)}{n(S)}$

$\text{P}(E) = \frac{3}{6}$

$\text{P}(E) = \frac{1}{2}$

Thus, the probability of getting an even number is $\frac{1}{2}$ (or $0.5$ or $50\%$).


Example 2. A bag contains 5 red balls and 3 blue balls. A ball is drawn at random from the bag. Find the probability that the ball drawn is blue.

Answer:

The random experiment is drawing a ball from the bag.

The total number of balls in the bag is $5 (\text{Red}) + 3 (\text{Blue}) = 8$.

Treating each of the 8 balls as a distinct, equally likely outcome, the sample space $S$ consists of 8 possible outcomes (drawing any one of the 8 balls). The total number of possible outcomes is $n(S) = 8$. Since the ball is drawn at random, each ball is equally likely to be drawn.

Let $B$ be the event that the ball drawn is blue. The outcomes favourable to event $B$ are drawing any of the 3 blue balls.

The number of outcomes favourable to $B$ is $n(B) = 3$.

Using the classical definition of probability (Formula (i)):

$\text{P}(B) = \frac{n(B)}{n(S)}$

$\text{P}(B) = \frac{3}{8}$

The probability of drawing a blue ball is $\frac{3}{8}$.



Conditional Probability

In many real-world situations, the probability of an event occurring can change if we have additional information about another event that has already occurred or is known to be true. This concept is handled by Conditional Probability. It measures the likelihood of an event happening, given that another event has already happened.


Definition of Conditional Probability

Let $A$ and $B$ be two events associated with the same random experiment. The conditional probability of event $A$ occurring, given that event $B$ has already occurred, is denoted by $P(A|B)$. It is read as "the probability of A given B".

The occurrence of event $B$ changes the set of possible outcomes for the experiment. Our new "universe" or effective sample space becomes event $B$. For event $A$ to occur under the condition that $B$ has occurred, the outcome must be in both $A$ and $B$, i.e., in the intersection $A \cap B$.

Assuming a finite sample space $S$ with equally likely outcomes:

The total number of outcomes in the original sample space is $n(S)$.

The number of outcomes favourable to event $B$ is $n(B)$. Since $B$ has occurred, our new sample space has $n(B)$ outcomes.

The number of outcomes favourable to event $A$ within this new sample space (i.e., outcomes that are in $A$ and also in $B$) is $n(A \cap B)$.

So, the conditional probability of $A$ given $B$ can be intuitively expressed as:

$\text{P}(A|B) = \frac{\text{Number of outcomes favourable to A and B}}{\text{Number of outcomes favourable to B}}$

$\text{P}(A|B) = \frac{n(A \cap B)}{n(B)}$

... (i)

To relate this to probabilities, we can divide the numerator and the denominator by the total number of outcomes in the original sample space, $n(S)$ (assuming $n(S) > 0$).

$\text{P}(A|B) = \frac{n(A \cap B)/n(S)}{n(B)/n(S)}$

From the classical definition of probability, $\frac{n(E)}{n(S)} = P(E)$ for any event $E$. Thus, we get the formal definition of conditional probability:

$\text{P}(A|B) = \frac{P(A \cap B)}{P(B)}$

... (ii)

This formula is valid provided that $P(B) > 0$. If $P(B) = 0$, event $B$ is impossible, and the conditional probability $P(A|B)$ is undefined as the conditioning event cannot occur.

Similarly, the conditional probability of event $B$ occurring given that event $A$ has already occurred is:

$\text{P}(B|A) = \frac{P(A \cap B)}{P(A)}$

... (iii)

This is valid provided that $P(A) > 0$.
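A minimal helper, assuming the joint probability $P(A \cap B)$ and the probability of the conditioning event are already known, that applies formulas (ii)/(iii) and refuses to condition on an impossible event:

```python
def conditional_probability(p_joint: float, p_condition: float) -> float:
    """P(A|B) = P(A ∩ B) / P(B); undefined when P(B) = 0."""
    if p_condition == 0:
        raise ValueError("Conditioning event has probability 0; P(A|B) is undefined.")
    return p_joint / p_condition

# Illustrative values: P(A ∩ B) = 1/3 and P(B) = 1/2 give P(A|B) = 2/3
print(conditional_probability(1/3, 1/2))
```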


Multiplication Rule of Probability (Joint Probability)

The definition of conditional probability can be rearranged to find the probability of the intersection of two events, $P(A \cap B)$, which is the probability that both $A$ and $B$ occur. This is also known as the joint probability of $A$ and $B$.

From the definition $P(A|B) = \frac{P(A \cap B)}{P(B)}$, multiplying both sides by $P(B)$ gives:

$\text{P}(A \cap B) = P(B) \times P(A|B)$

... (iv)

Similarly, from $P(B|A) = \frac{P(A \cap B)}{P(A)}$, multiplying both sides by $P(A)$ gives:

$\text{P}(A \cap B) = P(A) \times P(B|A)$

... (v)

These are the Multiplication Rules of Probability. They state that the probability of both $A$ and $B$ occurring is the probability of one event multiplied by the conditional probability of the other event occurring given the first one occurred. This is particularly useful in scenarios involving sequences of events, like drawing cards without replacement or multistage experiments.
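As an illustration of the multiplication rule for a without-replacement sequence (an assumed example, not taken from the text): the probability that two cards drawn in succession from a well-shuffled 52-card deck are both aces.

```python
from fractions import Fraction

p_first_ace = Fraction(4, 52)                # P(A): the first card is an ace
p_second_ace_given_first = Fraction(3, 51)   # P(B|A): 3 aces remain among 51 cards

# Multiplication rule: P(A ∩ B) = P(A) · P(B|A)
p_both_aces = p_first_ace * p_second_ace_given_first
print(p_both_aces)                           # 1/221
```

This agrees with the counting-based value $\binom{4}{2}/\binom{52}{2} = \frac{6}{1326} = \frac{1}{221}$.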


Independent and Dependent Events

Conditional probability helps us define the relationship between events: independence or dependence.

Independent Events:

Two events $A$ and $B$ are said to be independent if the occurrence or non-occurrence of one event does not affect the probability of the occurrence of the other event.

Mathematically, $A$ and $B$ are independent if $P(A|B) = P(A)$ (provided $P(B) > 0$), or equivalently $P(B|A) = P(B)$ (provided $P(A) > 0$).

An equivalent and often more useful definition for independence, which also applies when $P(A)$ or $P(B)$ is 0, is derived from the multiplication rule. If $P(A|B) = P(A)$, substituting this into $P(A \cap B) = P(B) \times P(A|B)$ gives:

$\text{P}(A \cap B) = P(B) \times P(A)$

... (vi)

So, two events $A$ and $B$ are independent if and only if their joint probability is equal to the product of their individual probabilities. This is the most commonly used test for independence.

Note: Independence should not be confused with mutually exclusive events. Mutually exclusive events ($A \cap B = \emptyset$) cannot happen at the same time. If $A$ and $B$ are mutually exclusive and $P(A) > 0, P(B) > 0$, then $P(A \cap B) = P(\emptyset) = 0$, while $P(A)P(B) > 0$. Thus, mutually exclusive events with non-zero probabilities are always dependent. Independence means the occurrence of one does not affect the likelihood of the other; mutual exclusivity means the occurrence of one makes the other impossible.

Dependent Events:

Two events $A$ and $B$ are said to be dependent if they are not independent. This means the occurrence of one event does affect the probability of the occurrence of the other event.

Mathematically, $A$ and $B$ are dependent if any of the following hold: $P(A|B) \neq P(A)$, $P(B|A) \neq P(B)$, or $P(A \cap B) \neq P(A) \times P(B)$.


Example 1. A fair die is rolled. Let A be the event of getting an even number, and B be the event of getting a number greater than 3. Find $P(A|B)$. Are A and B independent?

Answer:

The random experiment is rolling a fair die.

The sample space $S = \{1, 2, 3, 4, 5, 6\}$. The total number of outcomes is $n(S) = 6$. Since the die is fair, all outcomes are equally likely.

Event A: Getting an even number. $A = \{2, 4, 6\}$.

Number of outcomes in A is $n(A) = 3$.

Probability of A: $P(A) = \frac{n(A)}{n(S)} = \frac{3}{6} = \frac{1}{2}$.

Event B: Getting a number greater than 3. $B = \{4, 5, 6\}$.

Number of outcomes in B is $n(B) = 3$.

Probability of B: $P(B) = \frac{n(B)}{n(S)} = \frac{3}{6} = \frac{1}{2}$.

Now, let's find the intersection of events A and B, $A \cap B$, which represents the outcomes that are both even and greater than 3.

$A \cap B = \{x \in S \mid x \text{ is even and } x > 3\} = \{4, 6\}$.

Number of outcomes in $A \cap B$ is $n(A \cap B) = 2$.

Probability of $A \cap B$: $P(A \cap B) = \frac{n(A \cap B)}{n(S)} = \frac{2}{6} = \frac{1}{3}$.

We want to find the conditional probability $P(A|B)$, the probability of getting an even number given that the number is greater than 3. Using the formula (ii):

$\text{P}(A|B) = \frac{P(A \cap B)}{P(B)}$

$\text{P}(A|B) = \frac{1/3}{1/2}$

$\text{P}(A|B) = \frac{1}{3} \times \frac{2}{1} = \frac{2}{3}$

Alternatively (using reduced sample space):

Given that event B (getting a number greater than 3) has occurred, our effective sample space is the set of outcomes in B, which is $B = \{4, 5, 6\}$. The size of this reduced sample space is $n(B) = 3$.

Within this reduced sample space $B$, the outcomes that are also in event A (getting an even number) are $\{4, 6\}$. The number of such outcomes is 2.

The probability of A given B is the ratio of favorable outcomes in the reduced sample space to the size of the reduced sample space:

$\text{P}(A|B) = \frac{\text{Number of outcomes in } A \cap B}{\text{Number of outcomes in } B} = \frac{n(A \cap B)}{n(B)}$

$\text{P}(A|B) = \frac{2}{3}$

Both methods yield the same result.

Checking for Independence:

We need to check if $P(A \cap B) = P(A) \times P(B)$.

$P(A) = \frac{1}{2}$

$P(B) = \frac{1}{2}$

$P(A) \times P(B) = \frac{1}{2} \times \frac{1}{2} = \frac{1}{4}$.

$P(A \cap B) = \frac{1}{3}$.

Since $P(A \cap B) = \frac{1}{3} \neq \frac{1}{4} = P(A) \times P(B)$, the events A and B are dependent. The occurrence of event B (getting a number > 3) changes the probability of getting an even number (from $1/2$ to $2/3$).
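The same conclusion can be confirmed by brute-force enumeration of the sample space; a short verification sketch:

```python
from fractions import Fraction

S = set(range(1, 7))
A = {x for x in S if x % 2 == 0}   # even number
B = {x for x in S if x > 3}        # number greater than 3

def P(event):
    return Fraction(len(event), len(S))

p_A_given_B = Fraction(len(A & B), len(B))   # reduced-sample-space form
print(p_A_given_B)                           # 2/3
print(P(A & B) == P(A) * P(B))               # False -> A and B are dependent
```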


Example 2. A fair coin is tossed twice. Let A be the event 'Head on the first toss' and B be the event 'Tail on the second toss'. Are A and B independent?

Answer:

The random experiment is tossing a fair coin twice.

The sample space $S = \{(H, H), (H, T), (T, H), (T, T)\}$. The total number of outcomes is $n(S) = 4$. Since the coin is fair and tosses are independent, all outcomes are equally likely.

Event A: Head on the first toss. $A = \{(H, H), (H, T)\}$.

Number of outcomes in A is $n(A) = 2$.

Probability of A: $P(A) = \frac{n(A)}{n(S)} = \frac{2}{4} = \frac{1}{2}$.

Event B: Tail on the second toss. $B = \{(H, T), (T, T)\}$.

Number of outcomes in B is $n(B) = 2$.

Probability of B: $P(B) = \frac{n(B)}{n(S)} = \frac{2}{4} = \frac{1}{2}$.

The intersection of A and B, $A \cap B$, represents getting a Head on the first toss AND a Tail on the second toss.

$A \cap B = \{(H, T)\}$.

Number of outcomes in $A \cap B$ is $n(A \cap B) = 1$.

Probability of $A \cap B$: $P(A \cap B) = \frac{n(A \cap B)}{n(S)} = \frac{1}{4}$.

Checking for Independence:

We need to check if $P(A \cap B) = P(A) \times P(B)$.

$P(A) = \frac{1}{2}$

$P(B) = \frac{1}{2}$

$\text{P}(A) \times P(B) = \frac{1}{2} \times \frac{1}{2} = \frac{1}{4}$

$P(A \cap B) = \frac{1}{4}$.

Since $P(A \cap B) = \frac{1}{4}$ and $P(A)P(B) = \frac{1}{4}$, we have $P(A \cap B) = P(A)P(B)$.

Therefore, the events A and B are independent. This is expected because the outcome of the first coin toss does not influence the outcome of the second coin toss.



Total Probability

In probability theory, the Law of Total Probability is a fundamental rule that expresses the total probability of an outcome which can be realized via several distinct events. It allows us to calculate the probability of an event that occurs under various mutually exclusive and exhaustive conditions.


Partition of the Sample Space

Before we state the Law of Total Probability, we need to understand the concept of a partition of the sample space. A collection of events $E_1, E_2, \dots, E_n$ is said to constitute a partition of the sample space $S$ if they satisfy the following three conditions:

  1. Mutually Exclusive: The events are pairwise disjoint, meaning no two events can occur simultaneously.

    $E_i \cap E_j = \emptyset$ for all $i \neq j$, where $i, j \in \{1, 2, \dots, n\}$.

    This implies that an outcome belongs to at most one of these events.

  2. Exhaustive: The union of all the events covers the entire sample space. Every possible outcome of the experiment must belong to at least one of these events.

    $E_1 \cup E_2 \cup \dots \cup E_n = S$.

  3. Non-zero Probability: Each event in the partition must have a strictly positive probability of occurrence.

    $P(E_i) > 0$ for all $i \in \{1, 2, \dots, n\}$.

Essentially, a partition divides the sample space into non-overlapping regions, and taken together, these regions cover the entire sample space. Think of $E_1, E_2, \dots, E_n$ as representing a set of possible initial conditions or scenarios under which some other event $A$ might occur.

Example: In a manufacturing process, items might be produced by different machines (M1, M2, M3), or during different shifts (Shift 1, Shift 2), or using different batches of raw material (Batch A, Batch B). If these categories are mutually exclusive and cover all production, they form a partition of the sample space of produced items.


Law of Total Probability

Let $E_1, E_2, \dots, E_n$ be a partition of the sample space $S$. Let $A$ be any event in $S$. Then the probability of event $A$ is the sum of the probabilities of $A$ occurring under each of the partition events $E_i$. The Law of Total Probability is given by the formula:

$\text{P}(A) = P(A|E_1)P(E_1) + P(A|E_2)P(E_2) + \dots + P(A|E_n)P(E_n)$

... (i)

Using summation notation, this can be written concisely as:

$\text{P}(A) = \sum_{i=1}^{n} P(A|E_i)P(E_i)$

... (ii)

This law effectively breaks down the calculation of $P(A)$ into parts based on which event $E_i$ occurs. It says that the total probability of $A$ is the weighted average of the conditional probabilities of $A$ given each $E_i$, where the weights are the probabilities of the $E_i$'s themselves.
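A minimal sketch of formula (ii), assuming the partition probabilities $P(E_i)$ and the conditional probabilities $P(A|E_i)$ are supplied as parallel lists:

```python
def total_probability(priors, likelihoods):
    """P(A) = sum_i P(A|E_i) * P(E_i) for a partition E_1, ..., E_n."""
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("The partition probabilities P(E_i) must sum to 1.")
    if any(p <= 0 for p in priors):
        raise ValueError("Each P(E_i) in a partition must be strictly positive.")
    return sum(p_a_given_e * p_e for p_a_given_e, p_e in zip(likelihoods, priors))
```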

Derivation of the Law of Total Probability:

Let $S$ be the sample space and $E_1, E_2, \dots, E_n$ be a partition of $S$. Let $A$ be any event in $S$.

Since $E_1, E_2, \dots, E_n$ form a partition of $S$, every outcome in $S$ belongs to exactly one of the events $E_i$ (due to being mutually exclusive and exhaustive).

Any event $A$ can be expressed as the union of its intersections with each of the partition events:

$A = (A \cap E_1) \cup (A \cap E_2) \cup \dots \cup (A \cap E_n)$

(Since $A \subseteq S$ and $S = E_1 \cup \dots \cup E_n$)

Furthermore, since the events $E_i$ are mutually exclusive ($E_i \cap E_j = \emptyset$ for $i \neq j$), their intersections with $A$ are also mutually exclusive:

$(A \cap E_i) \cap (A \cap E_j) = A \cap (E_i \cap E_j) = A \cap \emptyset = \emptyset$ for $i \neq j$.

Since the events $(A \cap E_1), (A \cap E_2), \dots, (A \cap E_n)$ are mutually exclusive, we can use the Addition Rule for Mutually Exclusive Events to find the probability of their union (which is $A$):

$\text{P}(A) = P((A \cap E_1) \cup (A \cap E_2) \cup \dots \cup (A \cap E_n))$

$\text{P}(A) = P(A \cap E_1) + P(A \cap E_2) + \dots + P(A \cap E_n)$

... (iii)

From the Multiplication Rule of Probability (derived from conditional probability), we know that for any two events $A$ and $E_i$:

$\text{P}(A \cap E_i) = P(A|E_i)P(E_i)$

[Provided $P(E_i) > 0$, which is a condition for the partition]

Substituting this expression for $P(A \cap E_i)$ into equation (iii) for each $i$:

$\text{P}(A) = P(A|E_1)P(E_1) + P(A|E_2)P(E_2) + \dots + P(A|E_n)P(E_n)$

This completes the derivation of the Law of Total Probability.


Example 1. A factory has two machines, M1 and M2. Machine M1 produces 60% of the total items, and Machine M2 produces 40% of the total items. It is known that 2% of the items produced by Machine M1 are defective, and 3% of the items produced by Machine M2 are defective. A random item is selected from the total production. What is the probability that this selected item is defective?

Answer:

Let $S$ be the sample space of all items produced by the factory.

Let $E_1$ be the event that a randomly selected item is produced by Machine M1.

Let $E_2$ be the event that a randomly selected item is produced by Machine M2.

Since every item is produced by either M1 or M2, and no item is produced by both simultaneously, the events $E_1$ and $E_2$ form a partition of the sample space $S$.

We are given the probabilities of these events:

$\text{P}(E_1) = 60\% = 0.60$

$\text{P}(E_2) = 40\% = 0.40$

Note that $P(E_1) + P(E_2) = 0.60 + 0.40 = 1$, and both probabilities are greater than 0, confirming $E_1$ and $E_2$ form a valid partition.

Let $A$ be the event that the randomly selected item is defective.

We are given the conditional probabilities of an item being defective, given which machine produced it:

$\text{P}(A|E_1) = \text{P}(\text{Defective} \mid \text{Item from M1}) = 2\% = 0.02$

$\text{P}(A|E_2) = \text{P}(\text{Defective} \mid \text{Item from M2}) = 3\% = 0.03$

We want to find $P(A)$, the total probability that a randomly selected item is defective. We can use the Law of Total Probability (Formula (i)):

$\text{P}(A) = P(A|E_1)P(E_1) + P(A|E_2)P(E_2)$

Substitute the given values into the formula:

$\text{P}(A) = (0.02)(0.60) + (0.03)(0.40)$

Calculate the products:

$\text{P}(A) = 0.0120 + 0.0120$

Sum the results:

$\text{P}(A) = 0.0240$

The probability that a randomly selected item from the total production is defective is $0.024$, which is equivalent to $2.4\%$.

This calculation considers the proportion of items produced by each machine and their respective defective rates to find the overall defective rate.
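Expressed as a quick computation (mirroring the total_probability sketch given earlier), the same arithmetic is:

```python
priors = [0.60, 0.40]        # P(E1), P(E2): shares of production from M1 and M2
likelihoods = [0.02, 0.03]   # P(A|E1), P(A|E2): defective rates of M1 and M2

# Law of Total Probability: P(A) = P(A|E1)P(E1) + P(A|E2)P(E2)
p_defective = sum(l * p for l, p in zip(likelihoods, priors))
print(round(p_defective, 4))   # 0.024
```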



Bayes’ Theorem

Bayes' Theorem is a powerful mathematical formula used to update the probability of a hypothesis based on new evidence. It is particularly significant because it allows us to reverse conditional probabilities. While conditional probability tells us the probability of an effect given a cause ($P(\text{Effect}|\text{Cause})$), Bayes' Theorem helps us find the probability of a cause given an observed effect ($P(\text{Cause}|\text{Effect})$). This is crucial in fields like medical diagnosis, spam filtering, machine learning, and many areas involving inference under uncertainty.


Recap: Partition of the Sample Space

Bayes' Theorem relies on the concept of a partition of the sample space, which we discussed in the context of the Law of Total Probability. A collection of events $E_1, E_2, \dots, E_n$ forms a partition of the sample space $S$ if:

  1. They are mutually exclusive: $E_i \cap E_j = \emptyset$ for $i \neq j$. This means only one of these events can occur in a single trial of the experiment.
  2. They are exhaustive: $E_1 \cup E_2 \cup \dots \cup E_n = S$. This means at least one of these events must occur in any trial. Together, mutually exclusive and exhaustive imply exactly one event from the partition occurs.
  3. Each event has a non-zero probability: $P(E_i) > 0$ for all $i = 1, 2, \dots, n$.

The events $E_i$ can be thought of as distinct possible scenarios or "causes" that can lead to a particular "effect" event $A$.


Statement of Bayes' Theorem

Let $E_1, E_2, \dots, E_n$ be a set of $n$ mutually exclusive and exhaustive events (i.e., a partition) of the sample space $S$, such that $P(E_i) > 0$ for all $i = 1, 2, \dots, n$. Let $A$ be any event associated with $S$ such that $P(A) > 0$.

Then, for any specific event $E_i$ from the partition, the conditional probability of $E_i$ occurring given that event $A$ has occurred, $P(E_i|A)$, is given by Bayes' Theorem:

$\text{P}(E_i|A) = \frac{P(A|E_i)P(E_i)}{P(A)}$

... (i)

The denominator, $P(A)$, can be calculated using the Law of Total Probability: $P(A) = \sum_{j=1}^{n} P(A|E_j)P(E_j)$. Substituting this into the formula, we get the expanded form of Bayes' Theorem:

$\text{P}(E_i|A) = \frac{P(A|E_i)P(E_i)}{\sum_{j=1}^{n} P(A|E_j)P(E_j)}$

... (ii)

Interpretation of Terms:

  1. $P(E_i)$ is the prior probability of $E_i$: the probability assigned to the scenario $E_i$ before the evidence $A$ is observed.
  2. $P(A|E_i)$ is the likelihood: the probability of observing the evidence $A$ given that scenario $E_i$ is true.
  3. $P(A)$ is the total probability of the evidence $A$, computed via the Law of Total Probability.
  4. $P(E_i|A)$ is the posterior probability of $E_i$: the updated probability of the scenario $E_i$ after the evidence $A$ has been observed.


Derivation of Bayes' Theorem:

Bayes' Theorem is a direct consequence of the definition of conditional probability and the multiplication rule.

Recall the definition of conditional probability for events $E_i$ and $A$ (assuming $P(A) > 0$ and $P(E_i) > 0$):

$\text{P}(E_i|A) = \frac{P(E_i \cap A)}{P(A)}$

... (iii)

Also recall the Multiplication Rule of Probability, which states that the probability of the intersection of two events can be written in terms of conditional probability:

$\text{P}(E_i \cap A) = P(A \cap E_i) = P(A|E_i)P(E_i)$

... (iv)

Now, substitute the expression for $P(E_i \cap A)$ from equation (iv) into the numerator of the conditional probability formula in equation (iii):

$\text{P}(E_i|A) = \frac{P(A|E_i)P(E_i)}{P(A)}$

This is the basic form of Bayes' Theorem (Equation (i)). The denominator $P(A)$ represents the total probability of event $A$. If $E_1, E_2, \dots, E_n$ form a partition of the sample space, then by the Law of Total Probability:

$\text{P}(A) = \sum_{j=1}^{n} P(A|E_j)P(E_j)$

... (v)

Substituting equation (v) into the denominator of equation (i), we obtain the expanded form of Bayes' Theorem (Equation (ii)):

$\text{P}(E_i|A) = \frac{P(A|E_i)P(E_i)}{\sum_{j=1}^{n} P(A|E_j)P(E_j)}$

The derivation is straightforward, but the implications of the theorem are profound as they provide a formal mechanism for updating probabilities based on new evidence.
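A minimal sketch of formula (ii), assuming the priors $P(E_i)$ and the likelihoods $P(A|E_i)$ are given as parallel lists; it returns the posterior probability of every event in the partition:

```python
def bayes_posteriors(priors, likelihoods):
    """Return [P(E_1|A), ..., P(E_n|A)] using Bayes' Theorem.

    priors[i]      -- P(E_i), the prior probabilities (must sum to 1)
    likelihoods[i] -- P(A|E_i), the likelihood of the evidence under E_i
    """
    # Denominator: P(A), by the Law of Total Probability
    p_a = sum(l * p for l, p in zip(likelihoods, priors))
    if p_a == 0:
        raise ValueError("P(A) = 0; the posterior probabilities are undefined.")
    return [l * p / p_a for l, p in zip(likelihoods, priors)]
```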


Example 1. A factory has two machines, M1 and M2. Machine M1 produces 60% of the total items, and Machine M2 produces 40% of the total items. It is known that 2% of the items produced by Machine M1 are defective, and 3% of the items produced by Machine M2 are defective. A random item is selected from the total production and is found to be defective. What is the probability that this defective item was produced by Machine M1?

Answer:

Let $S$ be the sample space of all items produced.

Let $E_1$ be the event that an item is produced by Machine M1.

Let $E_2$ be the event that an item is produced by Machine M2.

$E_1$ and $E_2$ form a partition of $S$ as every item comes from exactly one of the machines.

We are given the following prior probabilities:

$\text{P}(E_1) = 60\% = 0.60$

(Prior probability of item from M1)

$\text{P}(E_2) = 40\% = 0.40$

(Prior probability of item from M2)

Let $A$ be the event that the selected item is defective.

We are given the conditional probabilities (likelihoods) of an item being defective given the machine:

$\text{P}(A|E_1) = \text{P}(\text{Defective} \mid \text{Item from M1}) = 2\% = 0.02$

(Likelihood of defective given M1)

$\text{P}(A|E_2) = \text{P}(\text{Defective} \mid \text{Item from M2}) = 3\% = 0.03$

(Likelihood of defective given M2)

We want to find the probability that the item was produced by Machine M1, given that it is defective. This is the posterior probability $P(E_1|A)$.

Using Bayes' Theorem (Formula (i)):

$\text{P}(E_1|A) = \frac{P(A|E_1)P(E_1)}{P(A)}$

[Bayes' Theorem]

First, we need to calculate $P(A)$, the total probability of event $A$ (getting a defective item), using the Law of Total Probability:

$\text{P}(A) = P(A|E_1)P(E_1) + P(A|E_2)P(E_2)$

[Law of Total Probability]

Substitute the known values:

$\text{P}(A) = (0.02)(0.60) + (0.03)(0.40)$

$\text{P}(A) = 0.0120 + 0.0120$

$\text{P}(A) = 0.0240$

[Total Probability of Defective Item]

Now, substitute this value of $P(A)$ back into the Bayes' Theorem formula for $P(E_1|A)$:

$\text{P}(E_1|A) = \frac{(0.02)(0.60)}{0.0240}$

$\text{P}(E_1|A) = \frac{0.0120}{0.0240}$

Simplify the fraction:

$\text{P}(E_1|A) = \frac{120}{240} = \frac{\cancel{120}^{1}}{\cancel{240}_{2}} = \frac{1}{2}$

$\text{P}(E_1|A) = 0.5$

[Posterior Probability of item from M1 given Defective]

Thus, if a randomly selected item is found to be defective, the probability that it was produced by Machine M1 is $0.5$ or $50\%$.

Calculation for P(E_2|A):

We can also find the probability that the defective item was produced by Machine M2, $P(E_2|A)$, using Bayes' Theorem:

$\text{P}(E_2|A) = \frac{P(A|E_2)P(E_2)}{P(A)}$

[Bayes' Theorem]

$\text{P}(E_2|A) = \frac{(0.03)(0.40)}{0.0240}$

$\text{P}(E_2|A) = \frac{0.0120}{0.0240}$

$\text{P}(E_2|A) = \frac{1}{2} = 0.5$

[Posterior Probability of item from M2 given Defective]

Note that $P(E_1|A) + P(E_2|A) = 0.5 + 0.5 = 1$, which is expected since a defective item must have originated from either M1 or M2.

Although Machine M2 has a higher defective rate ($3\%$ vs $2\%$), Machine M1 produces a larger share of the total items ($60\%$ vs $40\%$). Bayes' Theorem correctly weighs these factors. Out of the total defective items ($2.4\%$ of production), half come from M1 ($0.0120 / 0.0240 = 0.5$) and half come from M2 ($0.0120 / 0.0240 = 0.5$).
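The whole calculation can be reproduced in a few lines (using exact fractions to avoid rounding); the numbers below are exactly those of the example:

```python
from fractions import Fraction

priors = [Fraction(60, 100), Fraction(40, 100)]        # P(E1), P(E2)
likelihoods = [Fraction(2, 100), Fraction(3, 100)]     # P(A|E1), P(A|E2)

p_a = sum(l * p for l, p in zip(likelihoods, priors))              # P(A) = 3/125 = 0.024
posteriors = [l * p / p_a for l, p in zip(likelihoods, priors)]    # Bayes' Theorem
print(p_a, posteriors)    # 3/125 [Fraction(1, 2), Fraction(1, 2)]
```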